Generalization to Unseen Cases
Authors
Abstract
We analyze classification error on unseen cases, i.e. cases that differ from those in the training set. Unlike standard generalization error, this off-training-set error may differ significantly from the empirical error with high probability, even at large sample sizes. We derive a data-dependent bound on the difference between off-training-set error and standard generalization error. Our result is based on a new bound on the missing mass, which for small samples is stronger than existing bounds based on Good-Turing estimators. As we demonstrate on UCI datasets, our bound gives nontrivial generalization guarantees in many practical cases. In light of these results, we show that certain claims made in the No Free Lunch literature are overly pessimistic.
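For context on the quantity the abstract compares against: the classical Good-Turing estimate approximates the missing mass (the total probability of outcomes not yet observed) by the fraction of sample points whose value was seen exactly once. A minimal sketch, with illustrative sample data not drawn from the paper:

```python
from collections import Counter

def good_turing_missing_mass(sample):
    """Estimate the probability mass of unseen outcomes.

    The classical Good-Turing estimate is n1 / n, where n1 is the
    number of distinct values observed exactly once and n is the
    sample size.
    """
    counts = Counter(sample)
    n = len(sample)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / n

# 'c' and 'd' each appear once among 6 observations,
# so the estimated missing mass is 2/6.
print(good_turing_missing_mass(["a", "a", "b", "b", "c", "d"]))
```

The paper's contribution is a different, data-dependent bound on this quantity that is tighter than Good-Turing-based bounds for small samples; the estimator above is only the baseline being compared against.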
Similar Resources
Artificial Intelligent Modeling and Optimizing of an Industrial Hydrocracker Plant
The main objective of this study is the modeling and optimization of an industrial Hydrocracker Unit (HU) by means of an Adaptive Neuro-Fuzzy Inference System (ANFIS) model. For this purpose, data were collected from an industrial hydrocracker plant. Inputs of the ANFIS include the flow rate of fresh feed and recycle hydrogen, temperature of reactors, mole percentage of H2 and H2S, feed flow rate and...
Long Short-Term Memory for Speaker Generalization in Supervised Speech Separation
Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. T...
Localized Generalization Error of Gaussian-based Classifiers and Visualization of Decision Boundaries
In a pattern classification problem, one trains a classifier to recognize future unseen samples using a training dataset. In practice, one should not expect the trained classifier to correctly recognize samples dissimilar to the training dataset. Therefore, measuring the generalization capability of a classifier on those unseen samples may not help in improving the classifier's accuracy. The loc...
Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning
As a step towards developing zero-shot task generalization capabilities in reinforcement learning (RL), we introduce a new RL problem where the agent should learn to execute sequences of instructions after learning useful skills that solve subtasks. In this problem, we consider two types of generalizations: to previously unseen instructions and to longer sequences of instructions. For generaliz...
Communicating Hierarchical Neural Controllers for Learning Zero-shot Task Generalization
The ability to generalize from past experience to solve previously unseen tasks is a key research challenge in reinforcement learning (RL). In this paper, we consider RL tasks defined as a sequence of high-level instructions described by natural language and study two types of generalization: to unseen and longer sequences of previously seen instructions, and to sequences where the instructions...
Journal:
Volume, Issue:
Pages: -
Publication date: 2005